Hamiltonian dynamics with partial momentum refreshment, in the style of 
(Horowitz, 1991), explore the state space more slowly than they otherwise would 
due to the momentum reversals which occur on proposal rejection. These cause 
trajectories to double back on themselves, leading to random walk behavior on 
timescales longer than the typical rejection time, and leading to slower mixing. 
I present a technique by which the number of momentum reversals can be 
reduced. This is accomplished by maintaining the net exchange of probability 
between states with opposite momenta, but reducing the rate of exchange in 
both directions such that it is 0 in one direction. An experiment illustrates 
these reduced momentum flips accelerating mixing for a particular distribution. 

1 Formalism 

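A state $\zeta \in \mathbb{R}^{N \times 2}$ consists of a position $x \in \mathbb{R}^N$ and an auxiliary momentum 
$v \in \mathbb{R}^N$, $\zeta = \{x, v\}$. The state space has an associated Hamiltonian 

$$H(\zeta) = E(x) + \frac{1}{2} v^T v, \tag{1}$$

and a joint probability distribution 

$$p(x, v) = p(\zeta) = \frac{1}{Z} \exp\left(-H(\zeta)\right), \tag{2}$$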

where the normalization constant Z is the partition function. 
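
As a concrete illustration of Equations 1 and 2, the following minimal sketch evaluates the Hamiltonian and the unnormalized joint density for a small state. The Gaussian energy `energy` and the function names are assumptions chosen for illustration only; the text does not fix a particular $E(x)$ at this point.

```python
import numpy as np

def energy(x):
    # ASSUMPTION: example energy E(x) = x^T x / 2 (standard Gaussian).
    return 0.5 * np.dot(x, x)

def hamiltonian(x, v):
    # Equation 1: H(zeta) = E(x) + v^T v / 2
    return energy(x) + 0.5 * np.dot(v, v)

def unnormalized_joint(x, v):
    # Equation 2 without the 1/Z factor: exp(-H(zeta))
    return np.exp(-hamiltonian(x, v))

x = np.array([1.0, -2.0])
v = np.array([0.5, 0.0])
H = hamiltonian(x, v)

# The joint factorizes into a position term and a Gaussian momentum term.
factorized = np.exp(-energy(x)) * np.exp(-0.5 * np.dot(v, v))
```

Because the momentum enters only through $\frac{1}{2} v^T v$, the joint density is the product of the target density over $x$ and a unit Gaussian over $v$.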

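The momentum flip operator $F : \mathbb{R}^{N \times 2} \to \mathbb{R}^{N \times 2}$ negates the momentum. 
It has the properties: 

• $F$ negates the momentum, $F\zeta = F\{x, v\} = \{x, -v\}$ 

• $F$ is its own inverse, $F^{-1} = F$, $FF\zeta = \zeta$ 

• $F$ is volume preserving, $\left| \det \left( \frac{\partial (F\zeta)}{\partial \zeta^T} \right) \right| = 1$ 

• $F$ doesn't change the probability of a state, $p(\zeta) = p(F\zeta)$ 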
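
The properties of $F$ listed above can be checked numerically. This is a small sketch, assuming a Gaussian example energy and a hypothetical helper named `flip`; neither comes from the text.

```python
import numpy as np

def energy(x):
    # ASSUMPTION: example energy (standard Gaussian); any E(x) would do.
    return 0.5 * np.dot(x, x)

def hamiltonian(x, v):
    return energy(x) + 0.5 * np.dot(v, v)

def flip(x, v):
    """Momentum flip operator F: {x, v} -> {x, -v}."""
    return x, -v

rng = np.random.default_rng(0)
N = 3
x, v = rng.normal(size=N), rng.normal(size=N)

# F is its own inverse: applying it twice returns the original state.
x_ff, v_ff = flip(*flip(x, v))
same_after_two_flips = np.allclose(x_ff, x) and np.allclose(v_ff, v)

# F leaves H, and therefore p, unchanged.
same_probability = np.isclose(hamiltonian(*flip(x, v)), hamiltonian(x, v))

# F is linear, so its Jacobian is the constant block matrix diag(I, -I).
jacobian = np.block([[np.eye(N), np.zeros((N, N))],
                     [np.zeros((N, N)), -np.eye(N)]])
abs_det = abs(np.linalg.det(jacobian))
```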

The leapfrog integrator $L(n, \epsilon) : \mathbb{R}^{N \times 2} \to \mathbb{R}^{N \times 2}$ integrates Hamiltonian 
dynamics for the Hamiltonian $H(\zeta)$, using leapfrog integration, for $n \in \mathbb{Z}^+$ 
integration steps with stepsize $\epsilon \in \mathbb{R}^+$. We assume that $n$ and $\epsilon$ are constants, 
and write this operator simply as $L$. The leapfrog integrator $L$ has the following 
relevant properties: 

• $L$ is volume preserving, $\left| \det \left( \frac{\partial (L\zeta)}{\partial \zeta^T} \right) \right| = 1$ 

• $L$ is exactly reversible using momentum flips, $L^{-1} = FLF$, $\zeta = FLFL\zeta$ 
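
The two leapfrog properties above can be illustrated with a short numerical check. The sketch below assumes a quartic example energy and uses a finite-difference Jacobian; the helper names (`leapfrog`, `leapfrog_flat`) and the step settings are illustrative choices, not taken from the text.

```python
import numpy as np

def grad_energy(x):
    # ASSUMPTION: gradient of the example energy E(x) = sum(x**4) / 4.
    return x ** 3

def leapfrog(x, v, eps, n):
    """Leapfrog operator L(n, eps) applied to the state {x, v}."""
    for _ in range(n):
        v = v - 0.5 * eps * grad_energy(x)  # half step in momentum
        x = x + eps * v                     # full step in position
        v = v - 0.5 * eps * grad_energy(x)  # half step in momentum
    return x, v

eps, n, N = 0.1, 5, 2
rng = np.random.default_rng(0)
x0, v0 = rng.normal(size=N), rng.normal(size=N)

# Reversibility: F L F L zeta should return the starting state.
xa, va = leapfrog(x0, v0, eps, n)    # L
xb, vb = leapfrog(xa, -va, eps, n)   # L F
x1, v1 = xb, -vb                     # F L F L
reversal_error = max(np.abs(x1 - x0).max(), np.abs(v1 - v0).max())

# Volume preservation: |det| of a central-difference Jacobian should be ~1.
def leapfrog_flat(z):
    x, v = leapfrog(z[:N], z[N:], eps, n)
    return np.concatenate([x, v])

z0 = np.concatenate([x0, v0])
h = 1e-6
jac = np.column_stack([
    (leapfrog_flat(z0 + h * e) - leapfrog_flat(z0 - h * e)) / (2 * h)
    for e in np.eye(2 * N)
])
abs_det = abs(np.linalg.det(jac))
```

Reversibility holds up to floating-point roundoff, while the determinant check is limited by the accuracy of the finite-difference approximation.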

During sampling, state updates are performed using a transition operator 
$T(r) : \mathbb{R}^{N \times 2} \to \mathbb{R}^{N \times 2}$, where $r \sim U([0, 1))$ is drawn from the uniform 
distribution between 0 and 1, 

$$T(r)\zeta = \begin{cases} L\zeta & r < P_{leap}(\zeta) \\ F\zeta & P_{leap}(\zeta) \le r < P_{leap}(\zeta) + P_{flip}(\zeta) \\ \zeta & P_{leap}(\zeta) + P_{flip}(\zeta) \le r \end{cases} \tag{3}$$

$T(r)$ additionally depends on an acceptance probability for the leapfrog 
dynamics, $P_{leap}(\zeta) \in [0, 1]$, and a probability of negating the momentum, 
$P_{flip}(\zeta) \in [0, 1 - P_{leap}(\zeta)]$. These must be chosen to guarantee that $p(\zeta)$ is a 
fixed point of $T$.¹ 
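
The three-way branch in Equation 3 can be written as a small function. In this sketch the leapfrog map and the two probabilities are passed in as callables; the stand-in `leap` map and the constant probabilities in the demonstration are placeholders used only to exercise each branch, and they would not in general leave $p(\zeta)$ invariant.

```python
import numpy as np

def transition(x, v, r, leap, p_leap, p_flip):
    """Apply T(r) from Equation 3 to the state {x, v}."""
    pl = p_leap(x, v)
    pf = p_flip(x, v)
    if r < pl:
        return leap(x, v)   # accept the leapfrog proposal: L zeta
    if r < pl + pf:
        return x, -v        # negate the momentum: F zeta
    return x, v             # leave the state unchanged: zeta

# PLACEHOLDERS for demonstration only (hypothetical, not from the text).
leap = lambda x, v: (x + 0.1 * v, v)
p_leap = lambda x, v: 0.5
p_flip = lambda x, v: 0.3

x = np.array([0.0, 1.0])
v = np.array([1.0, -1.0])

leaped = transition(x, v, 0.2, leap, p_leap, p_flip)    # r < P_leap
flipped = transition(x, v, 0.6, leap, p_leap, p_flip)   # P_leap <= r < P_leap + P_flip
stayed = transition(x, v, 0.9, leap, p_leap, p_flip)    # P_leap + P_flip <= r
```

In a sampler, `r` would be drawn fresh from $U([0, 1))$ at every step, for example with `np.random.default_rng().uniform()`.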



2 Making the distribution of interest a fixed point 

In order to make $p(\zeta)$ a fixed point, we will choose the Markov dynamics $T$ so 
that on average as many transitions enter as leave state $\zeta$ at equilibrium. This 
is not pairwise detailed balance. Instead, we directly enforce zero net 
change in the probability of each state by summing over all allowed transitions 
into or out of the state. This constraint is analogous to Kirchhoff's current law, 
where the total current entering a node is set to 0. As can be seen from Equation 
3 and the definitions in Section 1, and as is illustrated in Figure 1, a state $\zeta$ can 
only lose probability to the two states $L\zeta$ and $F\zeta$, and gain probability from 
the two states $L^{-1}\zeta$ and $F^{-1}\zeta$. Equating the rates of probability inflow and 
outflow, we find 

\begin{align}
p(\zeta) P_{leap}(\zeta) + p(\zeta) P_{flip}(\zeta) &= p(L^{-1}\zeta) P_{leap}(L^{-1}\zeta) + p(F^{-1}\zeta) P_{flip}(F^{-1}\zeta) \tag{4} \\
&= p(L^{-1}\zeta) P_{leap}(L^{-1}\zeta) + p(\zeta) P_{flip}(F\zeta) \tag{5} \\
P_{flip}(\zeta) - P_{flip}(F\zeta) &= \frac{p(L^{-1}\zeta)}{p(\zeta)} P_{leap}(L^{-1}\zeta) - P_{leap}(\zeta). \tag{6}
\end{align}

Equation 5 uses $F^{-1} = F$ and $p(F\zeta) = p(\zeta)$, and Equation 6 follows on 
dividing by $p(\zeta)$ and rearranging. 

¹ This fixed point requirement can be written as $p(\zeta) = \int d\zeta' \, p(\zeta') \int_0^1 dr \, \delta\left(\zeta - T(r)\zeta'\right)$. 

Figure 1: This diagram illustrates the possible transitions between states using 
the Markov transition operator from Equation 3. In (a) the relevant states, 
represented by the nodes, are labeled. In (b) the possible transitions, represented 
by the arrows, are labeled. In Section 2, the net probability flow into and out 
of the state $\zeta$ is set to 0. 

Figure 2: A two dimensional image of the distribution used in Section 3. Pixel 
intensity corresponds to the probability density function at that location. 



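We choose the standard Metropolis-Hastings acceptance rules for $P_{leap}(\zeta)$, 

$$P_{leap}(\zeta) = \min\left(1, \frac{p(L\zeta)}{p(\zeta)}\right). \tag{7}$$

Substituting this in to Equation 6, we find 

\begin{align}
P_{flip}(\zeta) - P_{flip}(F\zeta) &= \frac{p(L^{-1}\zeta)}{p(\zeta)} \min\left(1, \frac{p(\zeta)}{p(L^{-1}\zeta)}\right) - \min\left(1, \frac{p(L\zeta)}{p(\zeta)}\right) \tag{8} \\
&= \min\left(\frac{p(L^{-1}\zeta)}{p(\zeta)}, 1\right) - \min\left(1, \frac{p(L\zeta)}{p(\zeta)}\right) \tag{9} \\
&= \min\left(1, \frac{p(LF\zeta)}{p(\zeta)}\right) - \min\left(1, \frac{p(L\zeta)}{p(\zeta)}\right). \tag{10}
\end{align}

The last step uses $L^{-1} = FLF$ and $p(F\zeta) = p(\zeta)$, so that $p(L^{-1}\zeta) = p(LF\zeta)$. 
Since $p(\zeta) = p(F\zeta)$, the first term in Equation 10 is $P_{leap}(F\zeta)$ and the second is $P_{leap}(\zeta)$. 

To satisfy Equation 10, we choose² the following form for $P_{flip}(\zeta)$, 

$$P_{flip}(\zeta) = \max\left(0, \min\left(1, \frac{p(LF\zeta)}{p(\zeta)}\right) - \min\left(1, \frac{p(L\zeta)}{p(\zeta)}\right)\right). \tag{11}$$

Note that $P_{flip}(\zeta) \le 1 - P_{leap}(\zeta)$, where $1 - P_{leap}(\zeta)$ is the rejection rate, and 
thus the momentum flip rate, in standard HMC. Using this form for $P_{flip}(\zeta)$ 
will generally reduce the number of momentum flips required. 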
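
The acceptance and flip probabilities can be checked numerically at random states. The sketch below writes Equation 11 as $P_{flip}(\zeta) = \max(0, P_{leap}(F\zeta) - P_{leap}(\zeta))$, which relies on $L^{-1} = FLF$ and $p(F\zeta) = p(\zeta)$. The quartic energy, the step size, and the function names are assumptions made for illustration.

```python
import numpy as np

def energy(x):
    # ASSUMPTION: example energy E(x) = sum(x**4) / 4.
    return 0.25 * np.sum(x ** 4)

def grad_energy(x):
    return x ** 3

def hamiltonian(x, v):
    return energy(x) + 0.5 * np.dot(v, v)

def leapfrog(x, v, eps=0.5, n=1):
    for _ in range(n):
        v = v - 0.5 * eps * grad_energy(x)
        x = x + eps * v
        v = v - 0.5 * eps * grad_energy(x)
    return x, v

def p_leap(x, v):
    # Equation 7: min(1, p(L zeta) / p(zeta)) = exp(min(0, H(zeta) - H(L zeta)))
    x_new, v_new = leapfrog(x, v)
    return np.exp(min(0.0, hamiltonian(x, v) - hamiltonian(x_new, v_new)))

def p_flip(x, v):
    # Equation 11, written as max(0, P_leap(F zeta) - P_leap(zeta)).
    return max(0.0, p_leap(x, -v) - p_leap(x, v))

rng = np.random.default_rng(0)
max_balance_error, bound_holds, one_sided = 0.0, True, True
for _ in range(1000):
    x, v = rng.normal(size=2), rng.normal(size=2)
    # Equation 10: P_flip(zeta) - P_flip(F zeta) = P_leap(F zeta) - P_leap(zeta)
    lhs = p_flip(x, v) - p_flip(x, -v)
    rhs = p_leap(x, -v) - p_leap(x, v)
    max_balance_error = max(max_balance_error, abs(lhs - rhs))
    # The flip rate never exceeds the standard HMC rejection rate.
    bound_holds = bound_holds and p_flip(x, v) <= 1.0 - p_leap(x, v) + 1e-12
    # Flips occur in at most one direction between zeta and F zeta.
    one_sided = one_sided and min(p_flip(x, v), p_flip(x, -v)) == 0.0
```

The `one_sided` check reflects the construction: for each pair of states with opposite momenta, the exchange rate is zero in one direction.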



² To recover standard HMC, instead set $P_{flip}(\zeta) = 1 - P_{leap}(\zeta)$. One can verify by 
substitution that this satisfies Equation 10. 

Figure 3: The covariance between samples as a function of the number of 
intervening sampling steps for HMC with standard rejection and rejection with 
fewer momentum reversals. Reducing the number of momentum reversals causes 
faster mixing, as evidenced by the faster falloff of the autocovariance. 

3 Example 

In order to demonstrate the accelerated mixing provided by this technique, 
samples were drawn from a simple distribution with standard rejection, and 
with separate rejection and momentum flipping rates as described above. In 
both cases, the leapfrog step length $\epsilon$ was set to 0.1, the number of integration 
steps $n$ was set to 1, and the momentum corruption rate $\beta$ was set so as to 
corrupt half the momentum per unit simulation time. Both samplers were run 
for 100,000 sampling steps. The distribution used was described by the energy 
function 

$$E = 100 \log^2\left(\sqrt{x_1^2 + x_2^2}\right). \tag{12}$$

A 2 dimensional image of this distribution can be seen in Figure 2. The 
autocovariance of the returned samples can be seen, as a function of the number 
of intervening sampling steps, in Figure 3. Sampling using the technique 
presented here led to more rapid decay of the autocovariance, consistent with faster 
mixing. 
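
The experiment can be sketched in a few lines. This is not the code used to produce the figures: the form of the partial momentum refreshment ($v \leftarrow \sqrt{1-\beta}\, v + \sqrt{\beta}\, \eta$ with $\eta$ unit Gaussian noise), the value $\beta = 1 - 0.5^{\epsilon n}$, the starting point, the ordering of the update steps, and all function names are assumptions, since the text does not specify them. The run below also uses far fewer than 100,000 steps to keep it quick.

```python
import numpy as np

def energy(x):
    # Equation 12: E = 100 log^2(sqrt(x1^2 + x2^2))
    return 100.0 * np.log(np.sqrt(np.sum(x ** 2))) ** 2

def grad_energy(x):
    r2 = np.sum(x ** 2)
    return 100.0 * np.log(r2) * x / r2

def hamiltonian(x, v):
    return energy(x) + 0.5 * np.dot(v, v)

def leapfrog(x, v, eps, n):
    for _ in range(n):
        v = v - 0.5 * eps * grad_energy(x)
        x = x + eps * v
        v = v - 0.5 * eps * grad_energy(x)
    return x, v

def p_leap(x, v, eps, n):
    x_new, v_new = leapfrog(x, v, eps, n)
    accept = np.exp(min(0.0, hamiltonian(x, v) - hamiltonian(x_new, v_new)))
    return accept, x_new, v_new

def sample(n_steps, reduced_flips, eps=0.1, n=1, seed=0):
    rng = np.random.default_rng(seed)
    beta = 1.0 - 0.5 ** (eps * n)  # ASSUMPTION: one reading of "half per unit time"
    x, v = np.array([1.0, 0.0]), rng.normal(size=2)
    xs, n_flips = np.empty((n_steps, 2)), 0
    for t in range(n_steps):
        pl, x_new, v_new = p_leap(x, v, eps, n)
        if reduced_flips:
            pl_rev, _, _ = p_leap(x, -v, eps, n)
            pf = max(0.0, pl_rev - pl)   # Equation 11
        else:
            pf = 1.0 - pl                # standard HMC: flip on every rejection
        r = rng.uniform()
        if r < pl:
            x, v = x_new, v_new
        elif r < pl + pf:
            v = -v
            n_flips += 1
        # ASSUMPTION: partial momentum refreshment after each transition.
        v = np.sqrt(1.0 - beta) * v + np.sqrt(beta) * rng.normal(size=2)
        xs[t] = x
    return xs, n_flips

def autocovariance(xs, max_lag):
    xc = xs - xs.mean(axis=0)
    return np.array([np.mean(np.sum(xc[:len(xc) - k] * xc[k:], axis=1))
                     for k in range(max_lag)])

xs_std, flips_std = sample(5000, reduced_flips=False)
xs_red, flips_red = sample(5000, reduced_flips=True)
acov_std = autocovariance(xs_std, 50)
acov_red = autocovariance(xs_red, 50)
print("momentum flips (standard, reduced):", flips_std, flips_red)
```

Comparing `acov_std` and `acov_red` as functions of lag gives the same kind of curve as Figure 3, although a short run with these assumed settings need not reproduce it quantitatively.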